A Federated Learning Benchmark for Drug-Target Interaction
Aggregating pharmaceutical data in the drug-target interaction (DTI) domain
has the potential to deliver life-saving breakthroughs. It is, however,
notoriously difficult due to regulatory constraints and commercial interests.
This work proposes the application of federated learning, which we argue to be
reconcilable with the industry's constraints, as it does not require sharing of
any information that would reveal the entities' data or any other high-level
summary of it. When used on a representative GraphDTA model and the KIBA
dataset, it achieves up to 15% improved performance relative to the best
available non-privacy-preserving alternative. Our extensive battery of
experiments shows that, unlike in other domains, the non-IID data distribution
in the DTI datasets does not deteriorate FL performance. Additionally, we
identify a material trade-off between the benefits of adding new data and the
cost of adding more clients.
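To illustrate how federated learning can train a shared model without pooling raw data, here is a minimal sketch of one server-side aggregation step. The abstract does not state which aggregation rule was used, so this assumes the common federated averaging (FedAvg) rule, where each client's parameters are weighted by its local sample count. The function name `federated_average` and the flat-list parameter layout are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Sample-size-weighted mean of client parameter vectors (FedAvg-style).

    Assumption: every client sends a flat parameter vector of the same
    length, plus the number of local samples it trained on. No raw data
    leaves the client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all parameter vectors must have the same length")

    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # this client's contribution to the global model
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


# Two hypothetical clients: the second holds three times as much data,
# so the global parameters sit closer to its update.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)
```

Under this rule, adding a client with few samples shifts the global model only slightly, which is one way to picture the trade-off between new data and additional clients mentioned above.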
Model-Agnostic Federated Learning
Since its debut in 2016, Federated Learning (FL) has been tied to the inner
workings of Deep Neural Networks (DNNs). On the one hand, this allowed its
development and widespread use as DNNs proliferated. On the other hand, it
neglected all those scenarios in which using DNNs is not possible or
advantageous. The fact that most current FL frameworks only allow training DNNs
reinforces this problem. To address the lack of FL solutions for non-DNN-based
use cases, we propose MAFL (Model-Agnostic Federated Learning). MAFL marries a
model-agnostic FL algorithm, AdaBoost.F, with an open industry-grade FL
framework: Intel OpenFL. MAFL is the first FL system not tied to any specific
type of machine learning model, allowing exploration of FL scenarios beyond
DNNs and trees. We test MAFL from multiple points of view, assessing its
correctness, flexibility and scaling properties up to 64 nodes. We optimised
the base software, achieving a 5.5x speedup on a standard FL scenario. MAFL is
compatible with x86-64, ARM-v8, Power and RISC-V.
Comment: Published at the EuroPar'23 conference, Limassol, Cyprus
Experimenting with Emerging ARM and RISC-V Systems for Decentralised Machine Learning
Decentralised Machine Learning (DML) enables collaborative machine learning
without centralised input data. Federated Learning (FL) and Edge Inference are
examples of DML. While tools for DML (especially FL) are starting to flourish,
many are not flexible and portable enough to experiment with novel systems
(e.g., RISC-V), non-fully connected topologies, and asynchronous collaboration
schemes. We overcome these limitations via a domain-specific language that
maps DML schemes onto an underlying middleware, namely the FastFlow parallel
programming library. We experiment with it by generating different working DML
schemes on two emerging architectures (ARM-v8, RISC-V) and the x86-64 platform.
We characterise the performance and energy efficiency of the presented schemes
and systems. As a byproduct, we introduce a RISC-V port of the PyTorch
framework, the first publicly available to our knowledge.
RISC-V-Based Platforms for HPC: Analyzing Non-functional Properties for Future HPC and Big-Data Clusters
High-Performance Computing (HPC) has evolved to support simulations of systems where physical experimentation is impractical, prohibitively expensive, or dangerous. This paper provides a general overview and showcases the analysis of non-functional properties
in RISC-V-based platforms for HPC. In particular, our analyses target the evaluation of power and energy control, thermal management, and reliability assessment of promising systems, structures, and technologies devised for current and future generations of HPC machines. The main design methodologies and technologies developed within the activities of the Future and HPC & Big Data spoke of the National Centre of HPC, Big Data and Quantum Computing project are described, along with the testbed for experimenting with two-phase cooling approaches.